Mitigating Adversarial Norm Training with Moral Axioms
Authors
Abstract
This paper addresses the issue of adversarial attacks on ethical AI systems. We investigate using moral axioms and rules of deontic logic in a norm learning framework to mitigate adversarial norm training. This model of moral intuition and construction provides AI systems with moral guard rails yet still allows for learning conventions. We evaluate our approach by drawing inspiration from a study commonly used in moral development research. This questionnaire aims to test an agent's ability to reason to moral conclusions despite opposed testimony. Our findings suggest that our model can still correctly evaluate moral situations and learn conventions in an adversarial training environment. We conclude that adding axiomatic moral prohibitions and deontic inference rules to a norm learning model makes it less vulnerable to adversarial attacks.
Similar resources
Mitigating Unwanted Biases with Adversarial Learning
Machine learning is a tool for building models that accurately represent input training data. When undesired biases concerning demographic groups are in the training data, well-trained models will reflect those biases. We present a framework for mitigating such biases by including a variable for the group of interest and simultaneously learning a predictor and an adversary. The input to the net...
Towards Mitigating Audio Adversarial Perturbations
Audio adversarial examples targeting automatic speech recognition systems have recently been made possible in different tasks, such as speech-to-text translation and speech classification. Here we aim to explore the robustness of these audio adversarial examples generated via two attack strategies by applying different signal processing methods to recover the original audio sequence. In additio...
Mitigating adversarial effects through randomization
Convolutional neural networks have demonstrated their powerful ability on various tasks in recent years. However, they are extremely vulnerable to adversarial examples. I.e., clean images, with imperceptible perturbations added, can easily cause convolutional neural networks to fail. In this paper, we propose to utilize randomization to mitigate adversarial effects. Specifically, we use two ran...
Axioms for the Norm Residue Isomorphism
We give an axiomatic framework for proving that the norm residue map is an isomorphism (i.e., for settling the motivic Bloch-Kato conjecture). This framework is a part of the Voevodsky-Rost program.
Adversarial Source Identification Game with Corrupted Training
We study a variant of the source identification game with training data in which part of the training data is corrupted by an attacker. In the addressed scenario, the defender aims at deciding whether a test sequence has been drawn according to a discrete memoryless source X ∼ P_X, whose statistics are known to him through the observation of a training sequence generated by X. In order to unde...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i10.26402